Journals
  Publication Years
  Keywords
Search within results Open Search
Please wait a minute...
For Selected: Toggle Thumbnails
Multi-source data parallel preprocessing method based on similar connection
GUO Fangfang, CHAO Luomeng, ZHU Jianwen
Journal of Computer Applications    2019, 39 (1): 57-60.   DOI: 10.11772/j.issn.1001-9081.2018071869
Abstract413)      PDF (587KB)(250)       Save
With the development of large-scale network environments and big data-related technologies, traditional data fusion analysis technology faces new challenges. Focusing on poor flexibility and low processing efficiency in current multi-source data fusion analysis process, a multi-source data parallel preprocessing method based on similar connection was proposed, in which the idea of dividing and conquering and paralleling was adopted. Firstly, the preprocessing method was improved to increase the flexibility by unifying similar semantics in multi-source data and retaining personality semantics. Secondly, an improved parallel MapReduce framework was proposed to improve the efficiency of similar connections. The experimental results show that the proposed method reduces total data volume by 32% while ensuring data integrity. Compared with traditional MapReduce framework, the improved framework decreases 43.91% of time consumed; therefore, the proposed method can effectively improve the efficiency of multi-source data fusion analysis.
Reference | Related Articles | Metrics